In-memory URL Compression
Abstract
A common problem for large-scale search engines and web spiders is how to handle the huge number of URLs they encounter. Traditional search engines and web spiders store URLs on hard disk without any compression, which results in slow performance and higher space requirements. This paper describes a simple URL compression algorithm that allows efficient compression and decompression. The algorithm is based on a delta encoding scheme that exploits URLs sharing common prefixes, combined with an AVL tree for efficient search. Our results show that a 50% size reduction is achieved.
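As a rough illustration of the delta encoding idea in the abstract, the sketch below stores each URL as the length of the prefix it shares with its predecessor plus the remaining suffix. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `delta_encode` and `delta_decode` are hypothetical, the example URLs are made up, and a sorted Python list stands in for the AVL tree, which the abstract does not describe in enough detail to reproduce.

```python
from os.path import commonprefix


def delta_encode(urls):
    """Encode URLs as (shared_prefix_length, suffix) pairs.

    Assumption: URLs are sorted first so that neighbours tend to share
    long prefixes. The paper uses an AVL tree; a sorted list is used
    here only to keep the sketch short.
    """
    encoded = []
    previous = ""
    for url in sorted(set(urls)):
        # Number of leading characters shared with the previous URL.
        shared = len(commonprefix([previous, url]))
        encoded.append((shared, url[shared:]))
        previous = url
    return encoded


def delta_decode(encoded):
    """Rebuild the sorted URL list from (shared_prefix_length, suffix) pairs."""
    urls = []
    previous = ""
    for shared, suffix in encoded:
        url = previous[:shared] + suffix
        urls.append(url)
        previous = url
    return urls


# Hypothetical example URLs, for illustration only.
urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/about.html",
    "http://www.example.org/",
]
encoded = delta_encode(urls)
for shared, suffix in encoded:
    print(shared, suffix)
```

In this sketch, decoding an arbitrary entry requires walking from the start of the list. A tree-based layout such as the one the abstract mentions would presumably shorten that chain, but the details here are an assumption.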
Similar resources
In-memory URL Compression using AVL Tree
A common problem for large-scale search engines and web spiders is how to handle the huge number of URLs they encounter. Traditional search engines and web spiders store URLs on hard disk without any compression, which results in slow performance and higher space requirements. This paper describes a simple URL compression algorithm that allows efficient compression and decompression. The compression alg...
Compression Analysis of Hollow Cylinder Basalt Continuous Filament Epoxy Composite Filled with Shape Memory Wire
This paper presents an experimental investigation into the compression behavior of shape memory alloy hybrid composites (SMAHC) subjected to quasi-static loading, taking into account the rotation effects of shape memory wire in basalt continuous filament (BCF) direct roving epoxy composite. Two types of specimen were prepared, the BCF direct roving reinforced epoxy composite filled with shape memory w...
Implementation of VLSI Based Image Compression Approach on Reconfigurable Computing System - A Survey
Image data require huge amounts of disk space and large bandwidths for transmission. Hence, image compression is necessary to reduce the amount of data required to represent a digital image. Therefore, an efficient technique for image compression is in high demand. Although many compression techniques are available, the technique which is faster, memory efficient and simple, surely...
URL Forwarding and Compression in Adaptive Web Caching
Web caching is generally acknowledged as an important service for alleviating focused overloads when certain web servers’ contents suddenly become popular. Cooperative caching systems are more effective than independent caches due to the larger collective backing store that cooperation creates. One such system currently being developed at UCLA, Adaptive Web Caching (AWC), uses an application-le...
Prioritize the ordering of URL queue in Focused crawler
The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. An unfocused approach often shows undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...
Publication date: 2001